
As part of the ongoing messy relationship between Google and the publishing industry, Google had become frustrated with what they saw as overreach by negotiators who put a value on news.
In June they commissioned a study to show the impact on their revenues of removing news. It ran from November for ten weeks or so, and in their press release last week Paul Liu, Director, Economics at Google, stated: “European news content in Search has no measurable impact on ad revenue for Google.”
It is clear from the release that this will become a familiar claim to regulators and the public, and so the study that sits behind it deserves a good look.
Is this a reasonable claim to be made from the statistical work that has been done?
Headline findings of Google study
Here are the headlines of the test: Google selected eight countries to be part of the test, namely Italy, Spain, Poland, Netherlands, Belgium, Denmark, Greece and Croatia. Those countries have over 190m citizens and make up about 43% of the EU population.
In those countries two 1% samples were chosen and run as a test and control pairing, so potentially three to four million individuals were involved.
All ‘press publications’ from the EU were identified. This would include over 13,000 designated European publishers.
For the test group, from 16 November to the end of January, these sites were disappeared from Google.
The experiment measured the impacts when compared to the control group on Search Daily Active Users (DAU), Search Advertising revenue, DAUs from Google News and Discover, plus Overall Google revenue (including Maps, Travel, Shopping, YouTube, 3rd party display ads).
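The "three to four million" estimate can be checked with a quick calculation. This is only a sketch of the arithmetic: it assumes the 1% samples were drawn against the full 190m population of the eight countries, which is an upper bound, since the release as described here does not say how many of those citizens are Google users.

```python
# Back-of-envelope check of the sample size quoted in the article.
# Assumption: each 1% sample is taken from the whole population of the
# eight countries; the real base (Google users) would be smaller.
population = 190_000_000   # "over 190m citizens"
sample_percent = 1         # each cell is a 1% sample
cells = 2                  # one test cell, one control cell

per_cell = population * sample_percent // 100
total = per_cell * cells

print(f"per cell: {per_cell:,}")   # people in each 1% cell
print(f"total:    {total:,}")      # people across test and control
```

On those assumptions each cell holds about 1.9m people and the pairing about 3.8m, which is consistent with the "potentially three to four million" figure.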
An impressive study, but with some shortcomings
As randomised controlled trials go, this is pretty impressive.
I have no doubt that the technical skill with which the test was run was high level, that the sample size was material, and that the integrity with which the science was applied was likely beyond reproach.
Before we get to the results, though, a few things are worth noting about the design of the study.
Definition of the services is very wide. A Search Daily Active User is recorded when a user in the study touches any Google search service anywhere, once, and Google is the default search service on almost all smartphones.
Search is defined in the study as all web, news, images, video searches.
Revenue from Search includes the take across all markets, so news searches but also shopping, entertainment, financial services, travel, health.
Markets selected excluded the two biggest EU digital advertising markets with the biggest press sector, namely Germany and France.
IAB numbers from 2022 have the eight selected countries at 35% of the digital ad market of the EU, compared to Germany plus France at 43%.
Customers were not notified of the change (as with most Google tests). This is important for behavioural change studies – it was never clear to the study participants that they would not find local sources of news in their searches.
News providers available for those searches would still have included publishers from outside the EU including the US and UK, or aggregators like Facebook. If the news cycle from November through January was dominated by the US election, a great deal of news searches would have been fulfilled from familiar international providers.
The study notes: “The top click gainers in the experiment were youtube.com, infobae.com, facebook.com, wikipedia.org, and pinterest.com” but doesn’t quantify it. Infobae for reference is a Spanish language aggregator site with a dedicated Donald Trump section on its homepage.
Timing of the study covered Black Friday through to Christmas and then the January sales period, covering the highest revenue peaks for Google in non-news revenues. This is true for both the control and the test cells, but will likely have an effect on DAU and revenue metrics that is not controlled for, particularly in total revenue comparisons.
Revenue attribution is usually incredibly complex – imagine all of the permutations of revenue across all Google properties at this scale for ten weeks of trading.
To run this study there is either a) a perfect proven internal model of revenue attribution down to an individual user level or b) a large set of assumptions and allocations applied. If it’s a) then hat tip to them, but it is much more likely b) and no more information is supplied on that.
What does Google’s study actually show?
Comparing the test 1% to the control 1% Google reported:
Across the total experiment, Daily Active Users in Search changed by -0.77%, Discover by -5.47% and Google News by +1.54%. So there was a very slight drop in general search usage and a slightly bigger drop in Discover where users had personalised their interests.
Ad Revenue from Search was +0.02%, Discover -2.03%, and +0.13% in Other Google. So there was no material change in overall ad revenue, and a slight drop in revenue from personalised sources.
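For readers less used to test and control comparisons, figures like these are typically read as the percentage difference of the test cell against the control cell. The sketch below shows that arithmetic only. The function name and the user counts are hypothetical, chosen to reproduce a -0.77% result, and the study's actual estimator is not described here.

```python
def relative_change(test: float, control: float) -> float:
    """Percentage difference of the test cell relative to the control cell."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (test - control) / control * 100.0


# Hypothetical daily active user counts, for illustration only.
control_dau = 1_000_000
test_dau = 992_300

print(f"{relative_change(test_dau, control_dau):+.2f}%")
```

A difference this small on a large base is why the size and definition of the base matter so much when interpreting the result.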
It’s an intriguing result, for sure. Based on the context of the experiment, I would paraphrase Paul Liu and say that the results demonstrate:
“Removing [some sources] of news content for [1% of customers] in Search [without notifying] those customers has no measurable impact on [overall] ad revenue for Google.”
News is not the main revenue driver for Google, and search as a category is much wider than news. Google revenue is disconnected from direct news provision, which has been evident for some time in their SEC filings.
I’m not sure any of these points are contended by the news industry.
The study demonstrates that Google is able to degrade the search experience in the news category and see limited impacts on its overall revenues. This is one of the classic tests of a dominant market position, that you can make your product worse and continue to rent seek.
Again, I’m not sure that the market position and impact is contended by many, even within Google.
The study does suggest a higher level of substitutability than expected between local news and international/aggregated news sources, but we don’t have the direct traffic performance from the affected publishers for the test and control cells to know that for sure. This is worth coming back to later.
What is the point of this study?
Google has incredible access to this type of experimentation, and is skilled at presenting the facts that suit their case, then in secondary commentary extrapolating that into their negotiating arguments.
Google argues consistently that news has little value to them, and so this study will be referenced a lot in the coming months.
This approach will use carefully calibrated language to wilfully argue what I think of as the centipede fallacy. Apologies in advance to insect lovers everywhere for this explanation.
If a centipede loses a leg, it can still get about its day-to-day business. This does not mean that legs have no impact on its movement. Remove all 100 legs and you have a sedentary centipede. The system effects are very different from the sample effect.
In the case of the Google study, the 1% sample is one centipede leg worth of news. The cumulative systemic impact of applying this to all news would likely be different because:
The policy would be permanent and news organisations would adapt their distribution strategies as a result.
The policy would be public (it would be very well reported in news media) and customers would adapt their news search strategies as a result, particularly in jurisdictions like Germany and France where pressure from US companies to bend the knee is not well received.
The policy would likely be challenged by both Apple and Samsung who pay Google billions to be the default general search engine on their mobile phones – can that contract be fulfilled without simple national news queries being serviced?
The policy would give an advantage to search competitors who would offer news when Google did not.
So on that basis the 100% cumulative impact looks very different to a 1% surreptitious delegging experiment.
We should be quick to point this out any time the experiment is used to infer that ‘news has no value to Google’. Going from a partial removal for 1% of users to that conclusion is a logical leap that cannot be made, and it could never have been proven by a 1% randomised controlled trial.
The lesson for publishers here is stark though – any individual publisher being suppressed has no impact and no leverage. The strength at the negotiating table comes in understanding the systemic industry level effects.
Key take-homes from Google’s value of news study
In summary I would take this away:
Google ran a serious but private study which gave an interesting result on a sample of unknowing users. It was designed with a methodology and with market conditions that mean the impact of all news on Google’s revenues cannot be inferred from the study itself.
Publishers will continue to argue for a valuation that reflects their role in the information and cultural landscape. Google will continue to argue for a low valuation that reflects their desire not to incur new costs.
Onto the next round.